Clustering Multivariate Data Streams by Correlating Attributes using Fractal Dimension
Abstract
A data stream is a flow of data produced continuously over time. Storing and analyzing such information becomes challenging due to the exponential growth of the collected data volume. Recently, some algorithms have been proposed to cluster data streams as a whole, but only a few of them deal with multivariate data streams. Even so, these algorithms merely aggregate the attributes without considering the correlation among them. To overcome this issue, we propose a new framework to cluster multivariate data streams based on their evolving behavior over time, exploring the correlations among their attributes by computing the fractal dimension. To evaluate our framework, we used real multisource and multidimensional climate data streams. Our results show that cluster quality and compactness can be improved compared to the competing methods, which suggests that correlations among attributes cannot be ignored. In fact, cluster compactness is 14 to 25 times better with our method, and our framework was 3 to 20 times faster than its competitors. Our framework also proves to be a useful tool to assist meteorologists in understanding climate behavior over a period of time.
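To make the idea of measuring attribute correlation through a fractal dimension concrete, the following is a minimal sketch of a box-counting estimate of the correlation fractal dimension (D2) for one window of multivariate points. It is not the framework described in the abstract: the function name `correlation_fractal_dimension`, the parameter `n_scales`, the grid sizes, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def correlation_fractal_dimension(points, n_scales=5):
    """Estimate D2 as the slope of log(sum of squared cell counts) vs log(cell side).

    Hypothetical helper for illustration; each attribute is rescaled to [0, 1]
    and the space is covered with grids of 2, 4, ..., 2**n_scales cells per axis.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    # A constant attribute gets span 1.0 so the rescaling never divides by zero.
    spans = [(max(p[d] for p in points) - lows[d]) or 1.0 for d in range(dims)]

    log_r, log_s = [], []
    for k in range(1, n_scales + 1):
        cells = 2 ** k
        # Count how many points fall into each occupied grid cell.
        occupancy = Counter(
            tuple(
                min(int((p[d] - lows[d]) / spans[d] * cells), cells - 1)
                for d in range(dims)
            )
            for p in points
        )
        log_r.append(math.log(1.0 / cells))
        log_s.append(math.log(sum(c * c for c in occupancy.values())))

    # Ordinary least-squares slope of log_s against log_r.
    mean_r = sum(log_r) / len(log_r)
    mean_s = sum(log_s) / len(log_s)
    num = sum((r - mean_r) * (s - mean_s) for r, s in zip(log_r, log_s))
    den = sum((r - mean_r) ** 2 for r in log_r)
    return num / den


# Synthetic data (assumed, not from the paper): two attributes that are
# perfectly correlated versus two attributes that are independent.
rng = random.Random(0)
correlated = [(i / 4095, 2 * (i / 4095) + 1) for i in range(4096)]
independent = [(rng.random(), rng.random()) for _ in range(20000)]

d_correlated = correlation_fractal_dimension(correlated)
d_independent = correlation_fractal_dimension(independent)
print(d_correlated, d_independent)
```

In this sketch, the perfectly correlated pair yields an estimate close to 1 (the points lie on a line), while the independent pair yields an estimate close to 2 (the points fill the plane). An estimate well below the number of attributes is therefore a sign that the attributes are correlated, which is the kind of information a plain aggregation of attributes would discard.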
Related resources
Improving Multivariate Data Streams Clustering
Clustering data streams is an important task in data mining research. Recently, some algorithms have been proposed to cluster data streams as a whole, but only a few of them deal with multivariate data streams. Even so, these algorithms merely aggregate the attributes without considering the correlation among them. In order to overcome this issue, we propose a new framework to cluster multivari...
Clustering Multivariate Climate Data Streams using Fractal Dimension
A data stream is a flow of data produced continuously over time. Storing and analyzing such information becomes challenging due to the exponential growth of the collected data volume. In this context, some methods were proposed to cluster data streams with similar behavior over time. However, those methods have failed on clustering data flows with more than one attribute, i.e., multivariat...
Particle Swarm Optimized Optimal Threshold Value Selection for Clustering based on Correlation Fractal Dimension
This paper focuses on the use of Fractal Dimension in clustering evolving data streams. Recently, Anuradha et al. proposed a new approach based on Relative Change in Fractal Dimension (RCFD) and a damped window model for clustering evolving data streams. Based on observations of that paper, this paper reveals that the formation of quality cluster is heavily ...
Clustering based on correlation fractal dimension over an evolving data stream
Online clustering of evolving high-dimensional data is an amazing challenge for data mining applications. Although many clustering strategies have been proposed, it is still an exciting task, since the published algorithms fail to do well with high-dimensional datasets, finding arbitrarily shaped clusters and handling outliers. Knowing fractal characteristics of dataset can help abstract the ...
Analysis of Resting-State fMRI Topological Graph Theory Properties in Methamphetamine Drug Users Applying Box-Counting Fractal Dimension
Introduction: Graph theoretical analysis of functional Magnetic Resonance Imaging (fMRI) data has provided new measures of mapping human brain in vivo. Of all methods to measure the functional connectivity between regions, Linear Correlation (LC) calculation of activity time series of the brain regions as a linear measure is considered the most ubiquitous one. The strength of the dependence obl...
Journal:
- JIDM
Volume 7
Publication date: 2016